The main goals for this week were:
look at non-abundance weighted measures of dissimilarity to see if there are real turnover events in the community throughout the season
Explore Erens MED analysis and rerun some ordinations to see if they are similar overall
Time series of phylum community composition across filter fractions and sampling stations
theme_set(theme_bw())
myord(
physeq = moth_good,
fraction = "CNA",
method = "PCoA",
distance = "bray",
title = "Full community Bray-Curtis PCoA"
)
myord(
physeq = moth_good,
fraction = "100LTR",
method = "PCoA",
distance = "bray",
title = "100um Bray-Curtis PCoA"
)
myord(
physeq = moth_good,
fraction = "53LTR",
method = "PCoA",
distance = "bray",
title = "53um Bray-Curtis PCoA"
)
myord(
physeq = moth_good,
fraction = "3NA",
method = "PCoA",
distance = "bray",
title = "3um Bray-Curtis PCoA"
)
myord(
physeq = moth_good,
fraction = "22NA",
method = "PCoA",
distance = "bray",
title = ".22um Bray-Curtis PCoA"
)
myord(
physeq = moth_good,
fraction = "CNA",
method = "PCoA",
distance = "sor",
title = "Full community Sorenson PCoA"
)
myord(
physeq = moth_good,
fraction = "100LTR",
method = "PCoA",
distance = "sor",
title = "100um Sorenson PCoA"
)
myord(
physeq = moth_good,
fraction = "53LTR",
method = "PCoA",
distance = "sor",
title = "53um Sorenson PCoA"
)
myord(
physeq = moth_good,
fraction = "3NA",
method = "PCoA",
distance = "sor",
title = "3um Sorenson PCoA"
)
myord(
physeq = moth_good,
fraction = "22NA",
method = "PCoA",
distance = "sor",
title = ".22um Sorenson PCoA"
)
Do a permutational ANOVA on constrained axes used in ordination
anova(cap.ord)
## Permutation test for capscale under reduced model
## Permutation: free
## Number of permutations: 999
##
## Model: capscale(formula = distance ~ ParMC + Nitrate + SRP + LogPhyco + Ammonia + DOC + pH + H2O2, data = data)
## Df Variance F Pr(>F)
## Model 8 3.2952 5.0977 0.001 ***
## Residual 42 3.3936
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How many reads of different genera are there?
print(cyano_sums)
## Microcystis Synechococcus Pseudanabaena unclassified
## 361375 214905 72053 41016
## Anabaena unclassified Aphanizomenon unclassified
## 22876 15002 327 1030
## Cyanobacterium unclassified unclassified Snowella
## 324 328 369 218
## unclassified Planktothrix unclassified Nostoc
## 908 345 114 18
## Oscillatoria Phormidium Merismopedia Cyanobium
## 37 122 31 6
## Leptolyngbya unclassified Chamaesiphon Calothrix
## 25 98 9 9
## unclassified Planktothricoides unclassified unclassified
## 1 2 68 11
## Chroococcidiopsis Gloeocalita
## 96 4
How many OTUs are there of each genus?
gen <- tax_table(cyanos)[,6]
table(gen)
## gen
## Anabaena Aphanizomenon Calothrix Chamaesiphon
## 9 1 2 3
## Chroococcidiopsis Cyanobacterium Cyanobium Gloeocalita
## 4 1 3 1
## Leptolyngbya Merismopedia Microcystis Nostoc
## 6 1 10 1
## Oscillatoria Phormidium Planktothricoides Planktothrix
## 1 3 1 2
## Pseudanabaena Snowella Synechococcus unclassified
## 5 2 10 257
OTU analysis: There are up to 10 OTUs for different genera, but there is usually only a couple abundant OTUs (others are spurious??). Anabaena has 2, Synechococcus has 4, Microcystis and pseudanabaena have only 1 dominant OTU.
There are no obvious patterns of shifts in OTU populations over time, except for synechococcus. The orange OTU only appears in the first part of the season and the red appears in the second. Also, it seems like synechococcus blooms slightly ahead of microcystis.
Both OTU 49 and 367 had no matches in NCBI to known things
Oligotype analysis: Synechococcus gets the award for most interesting organism this week!!! Already at the OTU level we can see more diversity in syn than any other cyano genera, but after oligotyping there is an astonishing amount of diversity. Is this real?? The BLAST results from each of the oligotypes show mixed results; Some were listed as synechococcus others as unclassified cyanobacteria. I went back to my taxonomy file and got rid of all sequences with a bootstrap value below 90 for synechococcus. The entropy analysis still reveals almost the same amount of diversity
It looks like most changes in community composition are due to changes in relative abundance rather than species turnover
A few problems with MED
MED paper reccomends running it with -M parameter N/10,000 or larger (i.e about 1500 for my dataset) Whats the point of this pipeline if youre going to throw away all of your rare (not even that rare) OTUs?
Entropy is biased based on library size. Since these samples have very different sequencing depths, I would need to normalize before running it to get accurate results
Ultimately MED is just an algorithmic improvement to Oligotyping, not a way to avoid OTU clustering with whole community data